1) Rationale and Research Questions


The northeastern United States is a region of high population density, a pattern that holds across several states, including Pennsylvania and New Jersey, which the Delaware Water Gap National Recreation Area spans. The recreation area is unusual in that it sits between two major interstates, I-80 and I-84. With a growing global population and sprawling urbanization, understanding the influence of development in close proximity to nationally protected areas is important for future urban planning. For this analysis, we chose to focus on point count data for four species of woodpeckers: downy woodpecker, pileated woodpecker, red-bellied woodpecker, and hairy woodpecker. Since these woodpeckers reside in the area year-round, they are appropriate species to choose when focusing on local environmental variables over time.

Ambient noise level, the presence of road noise, and hemlock tree condition score are used as possible explanatory variables for the detection count of the woodpecker species. Ambient noise level and the presence of road noise serve as indicators of human influence. Hemlock condition scores represent the health of nearby hemlocks, as woodpeckers have been known to exploit hemlock snags for nesting. This study investigates whether these possible explanatory variables have a statistically significant relationship with woodpecker detection count, as well as how the variables vary spatially.


Image 1. Pileated Woodpecker


Image 2. Hairy Woodpecker


Image 3. Downy Woodpecker


Image 4. Red-Bellied Woodpecker


2) Research Questions


Question 1: Is there a visual correlation between the possible explanatory variables (ambient noise, road noise, and hemlock condition) and the detection of woodpecker species in the Delaware Water Gap from 2014-2025?

Question 2: How do the appearance of woodpecker species, ambient noise, road noise, and hemlock condition vary spatially?

Question 3: Is there a statistically significant relationship between the possible explanatory variables (ambient noise, road noise, and hemlock condition) and the detection of woodpecker species in the Delaware Water Gap from 2014-2025?

3) Dataset Information


The dataset used in this analysis was downloaded from the Data.gov federal database and was originally published by the National Park Service (NPS). The data was gathered as part of the Eastern Rivers and Mountains Network Streamside Bird Monitoring Protocol. The dataset includes avian point counts from the following six National Park Service units:

  • Allegheny Portage Railroad National Historic Site
  • Bluestone National Scenic River
  • Delaware Water Gap National Recreation Area
  • Fort Necessity National Battlefield
  • Friendship Hill National Historic Site
  • New River Gorge National Park and Preserve

The data was collected during the months of May and June from 2011-2025. Each park consisted of multiple sites, with each site having three point count stations. The sites were sampled repeatedly, typically four times annually. Point counts were conducted for ten minutes each, providing ample data on the bird species detected. Additional environmental factors such as ambient noise, road noise, and hemlock condition were noted for each individual point count station by date of observation. Ambient noise was recorded in decibels with a handheld sensor, with values between 30 and 100 dB. Road noise was recorded as a true or false status, with true denoting that the observer found evidence that road noise was present. Hemlock condition score was assigned by the observer, with zero denoting that no hemlock trees were observed within a 50-meter radius of the point count station. Scores of one to four described the health of the hemlocks observed, with higher scores indicating greater decay.

Table 1: Dataset Information

| Information | Description |
|---|---|
| Data Publisher | Department of the Interior - National Park Service |
| Data Source | https://catalog.data.gov/dataset/streamside-bird-monitoring-data-in-eastern-rivers-and-mountains-network-parks-2011-2025-da |
| Variables Used | Unit Code, Site Name, Start Date ISO, Ambient Noise, Road Noise, Common Name, Hemlock Condition Score, Latitude, Longitude |
| Data Range | May 27th, 2011 to May 19th, 2025 |

4) Data Wrangling


After importing the dataset, the date column was converted to a date class and a year column was created for easier visual grouping. Unnecessary columns were removed, and the data frame was filtered to include only detections of the four woodpecker species within the Delaware Water Gap National Recreation Area. Columns with unintuitive names were renamed. At this point, the main data frame was considered wrangled. Further data frames were derived from this main data frame by grouping on the location and time variables (potential confounders) together with the woodpecker species and the possible explanatory variables. This produced detection counts for each woodpecker species under each set of environmental conditions.

The data frames were used to create averages of the ambient noise and hemlock condition scores per year and per site as well as the yearly road noise status per species. Additional data frames were created with the possible explanatory variables and their associated coordinates for mapping analysis.
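
The wrangling steps above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the code used in this analysis. The field names follow the "Variables Used" row of Table 1, but the sample rows, the site names, the `DEWA` and `NERI` unit codes, and the helper names (`wrangle`, `detections_per_year`, `mean_noise_per_site`) are assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import date
from statistics import mean

# Hypothetical rows for illustration only; these are not values from the dataset.
rows = [
    {"Unit Code": "DEWA", "Site Name": "Site A", "Start Date ISO": "2014-05-20",
     "Common Name": "Downy Woodpecker", "Ambient Noise": 42.0, "Road Noise": False},
    {"Unit Code": "DEWA", "Site Name": "Site A", "Start Date ISO": "2014-06-02",
     "Common Name": "Downy Woodpecker", "Ambient Noise": 48.0, "Road Noise": True},
    {"Unit Code": "DEWA", "Site Name": "Site B", "Start Date ISO": "2015-05-28",
     "Common Name": "Pileated Woodpecker", "Ambient Noise": 55.0, "Road Noise": False},
    {"Unit Code": "DEWA", "Site Name": "Site B", "Start Date ISO": "2015-06-10",
     "Common Name": "American Robin", "Ambient Noise": 50.0, "Road Noise": False},
    {"Unit Code": "NERI", "Site Name": "Site C", "Start Date ISO": "2014-05-22",
     "Common Name": "Hairy Woodpecker", "Ambient Noise": 40.0, "Road Noise": False},
]

WOODPECKERS = {
    "Downy Woodpecker", "Hairy Woodpecker",
    "Pileated Woodpecker", "Red-bellied Woodpecker",
}

def wrangle(rows, unit_code="DEWA"):
    """Keep the four woodpecker species in one park and add date and year fields."""
    kept = []
    for row in rows:
        if row["Unit Code"] == unit_code and row["Common Name"] in WOODPECKERS:
            # Assumes plain YYYY-MM-DD strings in the date column.
            parsed = date.fromisoformat(row["Start Date ISO"])
            kept.append({**row, "date": parsed, "year": parsed.year})
    return kept

def detections_per_year(records):
    """Count detections for each (year, species) pair."""
    return Counter((r["year"], r["Common Name"]) for r in records)

def mean_noise_per_site(records):
    """Average the ambient noise readings recorded at each site."""
    by_site = defaultdict(list)
    for r in records:
        by_site[r["Site Name"]].append(r["Ambient Noise"])
    return {site: mean(values) for site, values in by_site.items()}

records = wrangle(rows)
counts = detections_per_year(records)
site_noise = mean_noise_per_site(records)
```

The same grouping pattern extends to the yearly averages and the per-site hemlock scores by changing the grouping key and the averaged field.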

5) Exploratory Analysis


The exploratory analysis of our dataset consists of several plots and spatial map objects that relate our explanatory variables to woodpecker observations in the Delaware Water Gap National Recreation Area. First, the total observation count of each of the four species across all sites was plotted per year. This shows the relative frequency of woodpecker appearance in the park, as well as how the population may have changed over time. Next, the explanatory variables were graphed over time. The ambient noise levels were plotted by date to show the spread of noise levels present when a woodpecker was detected, and a line plot was produced to show the average ambient noise of all observations. We then examined road noise observations over time. The presence or absence of road noise was recorded in the data, and we wished to explore its relationship with woodpecker appearance, so a plot was created comparing the presence or absence of road noise per year and by species. The final explanatory variable, hemlock condition, was graphed as an average value over time and by species. Because the score is a discrete value assigned per observation, we checked whether the average score differed from year to year and how it related to woodpecker appearance.

5.1) Graphical Visualizations


**Figure 1.** Line Plot of Species Appearance Over Time



Figure 1 graphs the total number of observations of each woodpecker species per year.


**Figure 2.** Scatter Plot of Ambient Noise Over Time



Figure 2 plots all ambient noise values across all sites over time. Values are in decibels.


**Figure 3.** Line Plot of Average Ambient Noise



Figure 3 graphs the average ambient noise per year, with separate lines for each of the four species.


**Figure 4.** Road Noise Presence or Absence Across All Sites



Figure 4 graphs the presence or absence of road noise across all sites, with the segments of each bar representing the four woodpecker species.


**Figure 5.** Scatter Plot of Hemlock Condition Over Time



Figure 5 represents the average hemlock condition score across all sites per year, by each of the four species.


**Figure 6.** Comparison of Hemlock Condition and Detection Count



Figure 6 compares hemlock condition score and the number of detections of each of the four woodpecker species. Visually, a score of three is the most common observation.


5.2) Mapping Visualizations


After plotting woodpecker appearance and the explanatory variables over time, our objective was to show the observation sites and the frequency of woodpecker observations spatially. To begin, we created a mapview object with two layers: one showing every observation site in the data, and the other showing each site with a marker size that varies with the number of observations recorded there. Next, we wanted to give readers an understanding of how the explanatory variables differ spatially. To do so, we created another mapview object, with layers representing average ambient noise per site, sites with road noise, sites without road noise, and the average hemlock score per site.

Map 1. Map of All Monitoring Sites in the Delaware Water Gap National Recreation Area


Map 2. Map of Observations of the Four Woodpecker Species in the Delaware Water Gap National Recreation Area


Map 3. Map of Average Ambient Noise Levels (dB) and Average Hemlock Score Across all Sites


Map 4. Map of Sites with Observations of Road Noise and Without Road Noise


6) Analysis


With the data wrangling and exploratory analysis complete, we ran statistical tests on each of the explanatory variables to see whether they have any effect on woodpecker sightings. The explanatory variables differ in type and therefore need to be analyzed in different ways. To measure whether hemlock condition influences woodpecker sightings, we ran a one-way ANOVA because the hemlock data is categorical with more than two levels. We also ran a post-hoc Tukey Honest Significant Difference (HSD) test to see which pairs of hemlock condition scores differ significantly in woodpecker sightings. To analyze the impact of ambient noise, we used a simple linear regression because ambient noise is a single continuous explanatory variable. We also graphed the results with a linear model (lm) line of best fit to illustrate how ambient noise and woodpecker sightings are related. To analyze how road noise affected woodpecker sightings, we ran a t-test because road noise is the only explanatory variable with exactly two categories. To visualize the effects of road noise on woodpecker sightings, we also graphed sighting occurrences by whether road noise was present. Together, these tests allow us to determine whether the variables have any impact on woodpecker sightings.


Figure 7.
The one-way ANOVA indicates that hemlock condition score has a statistically significant effect on woodpecker sightings: the p-value is 0.0236, which is below the 0.05 threshold. However, the Tukey HSD test found no pair of hemlock condition scores with a significant difference in woodpecker sightings; all of the pairwise p-values were above 0.05.
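
To make the ANOVA result easier to interpret, the sketch below computes a one-way F statistic by hand using only the Python standard library. It illustrates the technique and is not the code used in this analysis. The detection counts in `counts_by_score` are made up, and `one_way_anova` is a hypothetical helper name.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of label -> list of values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical detection counts grouped by hemlock condition score.
counts_by_score = {
    0: [1, 2, 1],
    2: [2, 3, 2],
    3: [3, 4, 3],
}

f_stat, df_between, df_within = one_way_anova(counts_by_score)
# For this toy data, F is about 9.0 with (2, 6) degrees of freedom.
```

Converting F to a p-value requires the F distribution, which the standard library does not provide, so a statistics package would handle that step. A significant overall F alongside non-significant pairwise comparisons, as reported above, can occur because the Tukey HSD test adjusts for multiple comparisons.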


**Figure 8.** Woodpecker Sightings Based on Ambient Noise Level



The p-value is 0.04804, which is below 0.05, so we reject the null hypothesis in favor of the alternative. The alternative hypothesis states that the true correlation between ambient noise and woodpecker sightings is not equal to 0. The estimated correlation is -0.08991391, which indicates a weak negative relationship.
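
The relationship between the regression and the reported correlation can be sketched as follows. In a simple linear regression, the test of a zero slope is equivalent to the test of a zero Pearson correlation, which is why a single p-value serves both. The code uses only the Python standard library; the `noise` and `sightings` values are made up, and the function names are hypothetical. It is an illustration, not the code used in this analysis.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ols_slope(x, y):
    """Least-squares slope of y on x (the line-of-best-fit gradient)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def correlation_t_stat(r, n):
    """t statistic for H0: true correlation = 0, on n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))

# Hypothetical ambient noise readings (dB) and sighting counts.
noise = [35, 40, 45, 50, 55, 60]
sightings = [3, 2, 3, 2, 1, 2]

r = pearson_r(noise, sightings)
slope = ols_slope(noise, sightings)
t_stat = correlation_t_stat(r, len(noise))
```

The slope and the correlation always share a sign, so a negative correlation corresponds to a downward-sloping line of best fit. With a correlation as close to 0 as the one reported above, the line is nearly flat even though the test is significant.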


**Figure 9.** Woodpecker Sightings Based on Road Noise



The p-value of our t-test was 0.9768, which is well above the 0.05 threshold, so we fail to reject the null hypothesis that the two group means are equal. The mean of the group with road noise was 1.793478 and the mean of the group without road noise was 1.798206.
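
The two-group comparison can be sketched as below. The source does not state which variant of the t-test was run, so this example assumes Welch's unequal-variance form. It uses only the Python standard library; the two samples are made up, and `welch_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error contributed by each group.
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df

# Hypothetical sighting counts with and without road noise.
with_road_noise = [1, 2, 2, 3]
without_road_noise = [1, 2, 2, 3, 2]

t_stat, df = welch_t(with_road_noise, without_road_noise)
```

When the two group means are nearly identical, as in the results above, the t statistic is close to 0 and the p-value is correspondingly large.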


7) Summary and Conclusions

While we learned much from our study and are ready to answer our questions, we must acknowledge some limitations and potential knowledge gaps in our data and its collection. The data was collected only in May and June and at sites that overlap, so it provides only a small window into woodpecker populations in the Delaware Water Gap National Recreation Area, where the birds are present year-round. Data collection also relied entirely on observers seeing or hearing the birds, so birds could have been at a site without being detected. The data collection team had limited staff and could not observe every site simultaneously. All of these factors could have led to sightings being missed or misrepresented in the data. Another possible source of discrepancy is that road noise could be part of the ambient noise at a site; the dataset does not specify whether road noise is excluded from ambient noise levels.

From the data, we can see that the downy woodpecker is the most common woodpecker species in the area, with the hairy, pileated, and red-bellied woodpeckers appearing slightly less often. No species appeared at distinctly different ambient noise levels; all four were detected at roughly the same average ambient noise level each year. There was no observable trend, and no species was dominant at any particular ambient noise level. There also does not appear to be a visual trend relating road noise presence to woodpecker sightings. The number of observations with road noise present differed from the number without, but this could be attributed to the lack of roads in the near vicinity of many sites. Our graphs do show an observable trend in hemlock condition: detections were most frequent at a hemlock score of 3, which suggests that woodpeckers prefer this condition. A score of 3 indicates hemlocks with green, living branches near the crown and dead branches near the base. These conditions seem to be the most favorable for every woodpecker species.

There were also observable trends when we examined the spatial extent of woodpecker observations. The downy woodpecker mainly occupies the northern part of the recreation area. The hairy woodpecker seemed to occupy the northern and middle sections. The red-bellied woodpecker was seen more in the southern and middle sections, while the pileated woodpecker was present at similar levels throughout the area. The sites with the most woodpecker observations are in the middle and the north of the recreation area; the southern areas do not see as many woodpecker sightings. Ambient noise levels seem to be fairly even across the whole recreation area, with areas closer to roads and the airport having higher average ambient noise levels. Sites with road noise are concentrated in several small clusters along the highway that cuts through the recreation area. Hemlock condition scores also seem to be fairly evenly distributed across the recreation area. There is a small concentration of poor hemlock condition scores within the Delaware Water Gap itself, but outside of that, there is no concentration of any specific score.

When we examined our data statistically, we were able to draw conclusions about which explanatory variables are significant to woodpecker sightings. Our ANOVA test for hemlock condition score returned a p-value below 0.05, which indicates that hemlock condition score is a significant explanatory variable. Our simple linear regression also returned a p-value below 0.05, so we reject the null hypothesis in favor of the alternative, which states that woodpecker sightings and ambient noise levels are correlated. The estimated correlation is negative, so rising ambient noise levels are associated with slightly fewer woodpecker sightings. The effect is slight because the correlation is only a little below 0. When we analyzed the effect of road noise on woodpecker sightings, the p-value was well above 0.05. This means that we fail to reject our null hypothesis: we found no evidence of a difference in the average number of observations between the road noise and no road noise groups. The difference between the two group means is very small, which is consistent with the result of the t-test. Overall, the results of this test give no evidence of a relationship between woodpecker sightings and road noise.

For future work, we advise that ambient noise observations specify whether road noise was considered part of the ambient noise level. A larger observation staff able to survey the sites year-round would also be beneficial, as all four species are non-migratory. Finally, the site locations could be more evenly distributed across the recreation area so that they do not overlap.